X-RECOVER 0·09
A program to attempt recovery of an x-file's contents
X-Files is an image filing system written by Andy Armstrong. Normal
(filecore based) Risc OS filing systems are limited to 10 character
leafnames and 77 files per directory. X-Files supports long
filenames (up to 256 characters internally) and an unlimited number
of files per directory, storing files in an x-file (which in other
respects behaves like a "normal" directory).
Unfortunately X-Files is not totally robust, and can occasionally
corrupt an x-file, which it will then refuse to open. Sod's law
ensures that the data in the x-file is very important and not
backed up anywhere else. This is where x-recover comes in;
x-recover will _attempt_ to make sense of and extract data from
corrupted x-files.
Important: x-recover comes with absolutely NO WARRANTY
This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose. As x-recover
is written in C it almost certainly breaks the rules somewhere, and
hence is exhibiting "undefined behaviour". What this means in
English is that it could do anything at all, including, but not
limited to:
1. Working exactly as intended
2. Trashing your hard disc
3. Turning Barbara Cartland into a goth*
Usage
x-recover [options] <x-file> <destination>
x-file is the pathname of the x-file to attempt to recover.
x-recover does not check the filetype (so if x-file is not actually
an x-file x-recover will generate copious warnings as it discovers
this).
destination is the pathname of a directory (or empty x-file) to write
recovered contents into. destination should exist, or use the -c
option to create a new x-file with this name. destination can be
omitted if file output is suppressed (-g or -n options).
Options
-a extract the full chunk allocated to the file, rather than the
length used. Using this option causes the chunktable to be
written out.
-c create destination as an x-file if no directory/x-file of this
name exists.
-d Output probable _d_irectories in raw binary form. See
directories.
-f write out any free space between chunks. x-recover -a -d -f -1
will output the entire x-file split into chunks, which if
concatenated in numerical order will give the original x-file.
[see size]
-g guess the location of the chunktable and the root directory.
x-recover simply runs through the file looking for areas that
resemble the chunktable and the root directory. When used with
the -r and -t options this allows recovery even when the file
header is corrupted. This option suppresses all disc output.
-n no disc output. Integrity checks and errors are still reported
to the screen.
-r <offset> specify the offset of the root directory in x-file.
Useful if the header becomes corrupted - see header.
-s suppress file output, but still create the directories (and
free space) in destination. Mostly used for development
purposes to quickly check that things are working without
having to wait while files are copied, but x-recover -1 -f -s
will only extract directories and free space between
directories.
-t <offset> specify the offset of the chunktable in x-file. Useful if
the header becomes corrupted - see header.
-v verbose info. x-recover prints out more information about what it
is doing. Use more vs for more verbosity.
-1 Use Method 1 to attempt to recover x-file's contents. The various
methods are described below.
-2 Use Method 2 to attempt to recover x-file's contents. Method 2 is
the current default.
The x-file structure, and how this affects the prognosis
As an x-file contains within itself data from other files, it
must also store information about the contained files and their
data. If an x-file becomes corrupted, it is likely that some of
this housekeeping information is lost. The x-file structure is
described here - which parts survive determines if recovery is
possible, and if so the fidelity achievable.
Header
An x-file starts with a header which contains a signature, version
information and pointers to the chunktable and root directory. If
the information stored in the header becomes corrupted it is
possible to search for the chunktable and the root directory using
the -g option, and then to supply their offsets with the -t and -r
options.
Chunktable
Except for the header all information in the x-file is stored in
chunks. The index containing chunk sizes and positions is stored in
the chunktable - clearly if the chunktable is missing then the
x-file is simply an amorphous lump of data (like a corrupt hard
disc). Currently x-recover doesn't have the knowledge to identify
files from inspection of contents, so if the chunktable is missing
automated recovery is not possible as x-recover cannot determine
where one file stops and the next starts. If the chunktable is
reasonably intact then Method 1 can be used to extract the contents
of each chunk as separate files, but all name, filetype and
datestamp information will be lost.
Root Directory
Information about names, filetypes, attributes and modification
time is stored in the root directory and its subdirectories. If the
root directory can be found (either from the header or using the -g
option) then Method 2 can be used to attempt recovery of the
x-file's contents and directory structure, including file type
information.
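The kind of scan that -g performs for the root directory can be
illustrated with a short sketch. This is not x-recover's own code
(which is written in C); it is a Python illustration that assumes a
candidate directory is any offset at which the four bytes "Andy" (the
directory signature described under Directories) appear. The function
name and the choice to report every match are assumptions, and what
x-recover checks beyond the signature is not documented here.

```python
def find_directory_candidates(data: bytes, signature: bytes = b"Andy") -> list:
    """Return every offset in data at which signature occurs.

    Hypothetical helper: treats a bare signature match as a candidate
    directory, which is only an assumption about what -g looks for.
    """
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        # Resume one byte later so that no later match is skipped.
        pos = data.find(signature, pos + 1)
    return offsets
```

An offset found this way is the sort of value that could then be tried
with the -r option.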
Method 1
Method 1 reads in the chunktable, and then systematically writes
out the contents of each non-free chunk as a file in the destination
directory/x-file. If it suspects that the chunk represents a
directory it writes a text file describing the directory's
contents, else it copies out the raw contents with filetype Data
(FFD). All chunks can be copied out as raw contents with the -d
option. Files and directories are named as file0000 file0001 etc in
the order that they are found in the x-file body (probably not the
same order as the chunktable - use -vv (very verbose) or greater to
list the chunktable).
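The ordering and naming step of Method 1 can be sketched as follows.
This is a Python illustration only, not x-recover's C source. It
assumes the chunktable has already been read into a list of
(offset, size, is_free) tuples, which is a made-up in-memory form and
not the on-disc layout, and it leaves out directory detection
entirely. The names method1_plan and extract are hypothetical.

```python
def method1_plan(chunktable):
    """Map output names to (offset, size) for each non-free chunk.

    chunktable is a list of (offset, size, is_free) tuples, an assumed
    in-memory form of the chunktable rather than its on-disc layout.
    """
    used = [(off, size) for off, size, is_free in chunktable if not is_free]
    # Name by position in the x-file body, not by chunktable order.
    used.sort()
    return {"file%04d" % n: chunk for n, chunk in enumerate(used)}


def extract(data, plan):
    """Slice each planned chunk out of the x-file image held in data."""
    return {name: data[off:off + size] for name, (off, size) in plan.items()}
```

With a chunktable listing chunks at offsets 20, 8 (free) and 12, the
chunk at offset 12 becomes file0000 and the one at offset 20 becomes
file0001, whatever their order in the chunktable.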
Method 2
Method 2 reads in the chunktable and the root directory. Starting
in the root directory it attempts to create a list of all files,
their filetype, attributes, and location within the x-file
(obtained via the chunktable). It recreates this directory
structure within the destination directory/x-file, and then copies
out all the files it found with their correct filenames, restoring
filetypes, access attributes and modification times. It then tries
to tally files that it knows exist but has no location with
un-recovered chunks from the chunktable, and writes out any
successful matches to the destination. Finally it writes the
contents of any remaining unrecovered chunks as files in the
destination using Method 1. Well, that's the plan...
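The bookkeeping behind those last two steps can be sketched like this.
It is a Python illustration under stated assumptions, not the actual C
code: directory entries are assumed to have been flattened into
(pathname, chunk_index) pairs, chunk_index is None when the location
is unknown, and chunk_count is taken to be the number of non-free
chunks. The name split_chunks and the example pathnames are invented.

```python
def split_chunks(entries, chunk_count):
    """Separate chunks claimed by directory entries from the rest.

    entries is a list of (pathname, chunk_index) pairs. chunk_index is
    None when the directory names a file but its location is unknown,
    and may be out of range if the entry is damaged.
    """
    claimed = {}
    unplaced = []
    for path, index in entries:
        if index is None or not 0 <= index < chunk_count or index in claimed:
            # Known to exist, but still to be tallied against leftovers.
            unplaced.append(path)
        else:
            claimed[index] = path
    # Chunks that no entry claimed fall through to Method 1.
    leftovers = [i for i in range(chunk_count) if i not in claimed]
    return claimed, unplaced, leftovers
```

Files in claimed are copied out under their real names, files in
unplaced are the ones to tally against leftover chunks, and whatever
is still in leftovers afterwards is written out by Method 1.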
Directories
Like the normal Risc OS filing system, X-Files writes a special
signature at the start of all directories so that when it reads the
chunk containing the directory information, it can check that the
information is not corrupted. X-Files' signature is the string Andy
at the start of the chunk. Hence, if x-recover comes across a chunk
about which it has no information, it will have a look for this
signature. If present, the chunk is assumed to be a directory,
which means that there is a small chance that a file which starts
with the text "Andy" will be interpreted as a directory and
consequently garbled. If this happens, use the -d option to disable
directory identification, and the file contents will be recovered
intact.
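The false positive is easy to reproduce in a few lines. The sketch
below is Python written for illustration, not taken from x-recover;
the function name classify_chunk and the identify_directories flag
(standing in for the effect of -d) are assumptions.

```python
SIGNATURE = b"Andy"  # directory signature described in the text above


def classify_chunk(chunk: bytes, identify_directories: bool = True) -> str:
    """Return "directory" or "file" for a chunk with no other information."""
    if identify_directories and chunk.startswith(SIGNATURE):
        return "directory"
    return "file"
```

A text file beginning "Andy Armstrong wrote X-Files" is classified as
a directory by default, and as a file once identification is switched
off.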
Size
Size of a file is stored _twice_ in an x-file:
1. The directory stores the size of a file
2. The chunktable stores the size of a chunk
As the directory also stores an index into the chunktable, it is
possible to have conflicting sizes reported for a given file. In
this case x-recover will use the larger of the size in the
directory and the size in the chunktable. This could cause the
recovered file to acquire a copy of the start of the next chunk,
which can mean that _the sum of the sizes of the recovered parts is
greater than the size of the x-file_. You don't get any extra
information - you just get some of it twice!
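A small worked example shows how the total can grow. The sketch is in
Python purely for illustration; the function names and the figures
used are made up, and only the "use the larger size" rule comes from
the description above.

```python
def recovered_length(directory_size, chunk_size):
    """Pick the number of bytes to copy when the two records disagree."""
    return max(directory_size, chunk_size)


def total_recovered(files):
    """Sum the recovered lengths for (directory_size, chunk_size) pairs."""
    return sum(recovered_length(d, c) for d, c in files)
```

Take two adjacent 100-byte chunks, where the directory claims 130
bytes for the first file. total_recovered([(130, 100), (100, 100)])
gives 230 bytes from 200 bytes of x-file; the extra 30 are a second
copy of the start of the second chunk.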
Bugs
We don't do bugs.
None known - everything works to the design. However, there are
known design deficiencies (and planned improvements). Please report
bugs (preferably with fixes) to <bagpuss@done.net>. If you can
supply an x-file to demonstrate then this would be useful.
Currently I'm quite happy for relevant e-mail up to 1Mb, but if
bagpuss.done.net is up then use anonymous ftp to upload problem
files to ftp://bagpuss.done.net/
Files up to 100Mb are acceptable by this method. No, I'm not
confusing Kb with Mb. If you're on Janet then you should be able to
shift 100Mb to me in 7 minutes.
Design deficiencies
Unrecovered chunks are written out as file#### in the destination
directory/x-file _after_ files are recovered, overwriting any
genuine files with the same name(s). Of course, no-one names files
like this... (So if you recover one x-file into another rename
these files rapidly.)
The file#### naming system assumes that you have fewer than 10000
chunks. If this is violated x-recover will probably crash from
undefined behaviour as several internal buffers overflow. Don't say
that I didn't warn you - this software comes with absolutely no
warranty.
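Why 10000 is the limit can be seen from the name length. The sketch
below is Python for illustration, not x-recover's code, and the guard
it adds is hypothetical rather than something x-recover does. Names
for chunks 0 to 9999 are eight characters long; a fifth digit makes
nine, which in C would overrun any buffer sized for exactly eight
characters plus a terminator (that buffer size is an assumption).

```python
def chunk_name(n: int) -> str:
    """Build a file#### style name, refusing numbers that need more digits."""
    if not 0 <= n <= 9999:
        raise ValueError("chunk number %d does not fit in four digits" % n)
    return "file%04d" % n
```

chunk_name(9999) gives "file9999", while chunk_name(10000) raises
ValueError instead of producing the nine-character "file10000".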
Method 2 doesn't re-attach subdirectories (and hence cannot recover
name/date/attribute information for files contained in these
directories). The chunks are recovered by Method 1. I can't just
stick them in the destination directory/x-file in case two names
clash, as the second will overwrite the first. Method 3 (when
written (when someone asks me to)) will hook unclaimed directories
into the correct parent (with names like dir_0000 ).
Likewise, Method 2 (and 3) ignore entries in the dirhash that don't
correlate with full filenames. Consider what would happen with two
files _aard_vark and _aard_wolf...
Don't be caught out by
Method 2 restores file permissions (where known). If it restores a
file as LR/ (no write access) and an attempt is made to recover the
x-file to the same directory, x-recover will not be able to open
the file again, so will recover the contents as a Method 1
unrecovered chunk.
* Black is a much nicer colour than pink
_________________________________________________________________
This document was last reviewed on Wednesday November 6th 1996
Nicholas Clark <Nicholas.Clark@Liverpool.ac.uk>